Reducing Overdetections in a French Symbolic Grammar Checker by Classification
Internal identifier: 000330 (Main/Exploration); previous: 000329; next: 000331
Authors: Fabrizio Gotti [Canada]; Philippe Langlais [Canada]; Guy Lapalme [Canada]; Simon Charest [Canada]; Éric Brunelle [Canada]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2011.
English descriptors
- Teeft :
- Agreement error, Analytix, Certain detections, Checker, Computational, Computational linguistics, Corrections, Decision tree, Decision trees, Detection, Detection types, Dont, Druide, Druide informatique, Eventual implementation, Faire partie, False positives, Good corrections, Good detection, Grammar, Grammar checker, Grammar checkers, Grammar engineering, Grammatical category, Head word, Interesting challenges, Large community, Legitimate detections, Manual inspection, Natural language processing, Noun adjunct, Other detections, Overdetection, Overdetections, Parser, Past participle, Poor quality, Rule induction, Same sentence, Scatter plot cluster, Scorali, Sofkova hashemi, Syntactical parse, Training data, True positives.
Abstract
We describe the development of an “overdetection” identifier, a system for filtering detections erroneously flagged by a grammar checker. Various families of classifiers have been trained in a supervised way for 14 types of detections made by a commercial French grammar checker. Eight of these were integrated in the most recent commercial version of the system. This is a striking illustration of how a machine learning component can be successfully embedded in Antidote, a robust, commercial, as well as popular natural language application.
DOI: 10.1007/978-3-642-19437-5_32
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000F87
- to stream Istex, to step Curation: 000F09
- to stream Istex, to step Checkpoint: 000191
- to stream Main, to step Merge: 000330
- to stream Main, to step Curation: 000330
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Reducing Overdetections in a French Symbolic Grammar Checker by Classification</title>
<author><name sortKey="Gotti, Fabrizio" sort="Gotti, Fabrizio" uniqKey="Gotti F" first="Fabrizio" last="Gotti">Fabrizio Gotti</name>
</author>
<author><name sortKey="Langlais, Philippe" sort="Langlais, Philippe" uniqKey="Langlais P" first="Philippe" last="Langlais">Philippe Langlais</name>
</author>
<author><name sortKey="Lapalme, Guy" sort="Lapalme, Guy" uniqKey="Lapalme G" first="Guy" last="Lapalme">Guy Lapalme</name>
</author>
<author><name sortKey="Charest, Simon" sort="Charest, Simon" uniqKey="Charest S" first="Simon" last="Charest">Simon Charest</name>
</author>
<author><name sortKey="Brunelle, Eric" sort="Brunelle, Eric" uniqKey="Brunelle E" first="Éric" last="Brunelle">Éric Brunelle</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:98FD0CC0ED5CE29441F66EB69934609BA2E8B9B8</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1007/978-3-642-19437-5_32</idno>
<idno type="url">https://api.istex.fr/document/98FD0CC0ED5CE29441F66EB69934609BA2E8B9B8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000F87</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000F87</idno>
<idno type="wicri:Area/Istex/Curation">000F09</idno>
<idno type="wicri:Area/Istex/Checkpoint">000191</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000191</idno>
<idno type="wicri:doubleKey">0302-9743:2011:Gotti F:reducing:overdetections:in</idno>
<idno type="wicri:Area/Main/Merge">000330</idno>
<idno type="wicri:Area/Main/Curation">000330</idno>
<idno type="wicri:Area/Main/Exploration">000330</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Reducing Overdetections in a French Symbolic Grammar Checker by Classification</title>
<author><name sortKey="Gotti, Fabrizio" sort="Gotti, Fabrizio" uniqKey="Gotti F" first="Fabrizio" last="Gotti">Fabrizio Gotti</name>
<affiliation wicri:level="1"><country xml:lang="fr">Canada</country>
<wicri:regionArea>DIRO, Univ. de Montréal, Succ Centre-Ville, C.P. 6128, H3C 3J7, Montréal, Québec</wicri:regionArea>
<wicri:noRegion>Québec</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Langlais, Philippe" sort="Langlais, Philippe" uniqKey="Langlais P" first="Philippe" last="Langlais">Philippe Langlais</name>
<affiliation wicri:level="1"><country xml:lang="fr">Canada</country>
<wicri:regionArea>DIRO, Univ. de Montréal, Succ Centre-Ville, C.P. 6128, H3C 3J7, Montréal, Québec</wicri:regionArea>
<wicri:noRegion>Québec</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Lapalme, Guy" sort="Lapalme, Guy" uniqKey="Lapalme G" first="Guy" last="Lapalme">Guy Lapalme</name>
<affiliation wicri:level="1"><country xml:lang="fr">Canada</country>
<wicri:regionArea>DIRO, Univ. de Montréal, Succ Centre-Ville, C.P. 6128, H3C 3J7, Montréal, Québec</wicri:regionArea>
<wicri:noRegion>Québec</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Charest, Simon" sort="Charest, Simon" uniqKey="Charest S" first="Simon" last="Charest">Simon Charest</name>
<affiliation wicri:level="1"><country xml:lang="fr">Canada</country>
<wicri:regionArea>Druide Informatique, 1435 rue Saint-Alexandre, bureau 1040, H3A 2G4, Montréal, Québec</wicri:regionArea>
<wicri:noRegion>Québec</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Brunelle, Eric" sort="Brunelle, Eric" uniqKey="Brunelle E" first="Éric" last="Brunelle">Éric Brunelle</name>
<affiliation wicri:level="1"><country xml:lang="fr">Canada</country>
<wicri:regionArea>Druide Informatique, 1435 rue Saint-Alexandre, bureau 1040, H3A 2G4, Montréal, Québec</wicri:regionArea>
<wicri:noRegion>Québec</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2011</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="Teeft" xml:lang="en"><term>Agreement error</term>
<term>Analytix</term>
<term>Certain detections</term>
<term>Checker</term>
<term>Computational</term>
<term>Computational linguistics</term>
<term>Corrections</term>
<term>Decision tree</term>
<term>Decision trees</term>
<term>Detection</term>
<term>Detection types</term>
<term>Dont</term>
<term>Druide</term>
<term>Druide informatique</term>
<term>Eventual implementation</term>
<term>Faire partie</term>
<term>False positives</term>
<term>Good corrections</term>
<term>Good detection</term>
<term>Grammar</term>
<term>Grammar checker</term>
<term>Grammar checkers</term>
<term>Grammar engineering</term>
<term>Grammatical category</term>
<term>Head word</term>
<term>Interesting challenges</term>
<term>Large community</term>
<term>Legitimate detections</term>
<term>Manual inspection</term>
<term>Natural language processing</term>
<term>Noun adjunct</term>
<term>Other detections</term>
<term>Overdetection</term>
<term>Overdetections</term>
<term>Parser</term>
<term>Past participle</term>
<term>Poor quality</term>
<term>Rule induction</term>
<term>Same sentence</term>
<term>Scatter plot cluster</term>
<term>Scorali</term>
<term>Sofkova hashemi</term>
<term>Syntactical parse</term>
<term>Training data</term>
<term>True positives</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We describe the development of an “overdetection” identifier, a system for filtering detections erroneously flagged by a grammar checker. Various families of classifiers have been trained in a supervised way for 14 types of detections made by a commercial French grammar checker. Eight of these were integrated in the most recent commercial version of the system. This is a striking illustration of how a machine learning component can be successfully embedded in Antidote, a robust, commercial, as well as popular natural language application.</div>
</front>
</TEI>
<affiliations><list><country><li>Canada</li>
</country>
</list>
<tree><country name="Canada"><noRegion><name sortKey="Gotti, Fabrizio" sort="Gotti, Fabrizio" uniqKey="Gotti F" first="Fabrizio" last="Gotti">Fabrizio Gotti</name>
</noRegion>
<name sortKey="Brunelle, Eric" sort="Brunelle, Eric" uniqKey="Brunelle E" first="Éric" last="Brunelle">Éric Brunelle</name>
<name sortKey="Charest, Simon" sort="Charest, Simon" uniqKey="Charest S" first="Simon" last="Charest">Simon Charest</name>
<name sortKey="Langlais, Philippe" sort="Langlais, Philippe" uniqKey="Langlais P" first="Philippe" last="Langlais">Philippe Langlais</name>
<name sortKey="Lapalme, Guy" sort="Lapalme, Guy" uniqKey="Lapalme G" first="Guy" last="Lapalme">Guy Lapalme</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000330 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000330 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sarre |area= MusicSarreV3 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:98FD0CC0ED5CE29441F66EB69934609BA2E8B9B8 |texte= Reducing Overdetections in a French Symbolic Grammar Checker by Classification }}
This area was generated with Dilib version V0.6.33.